Discriminative Fuzzy Clustering Maximum a Posterior Linear Regression for Speaker Adaptation
Abstract
We propose a discriminative fuzzy clustering maximum a posteriori linear regression (DFCMAPLR) model adaptation approach to compensate for the acoustic mismatch caused by speaker variability. DFCMAPLR estimates a set of shared affine transforms under the MAP criterion and a set of fuzzy weights under a discriminative objective function. A linear combination of the shared affine transforms, weighted by the fuzzy weights, then forms more specific affine transforms for model adaptation. By incorporating the MAP criterion and discriminative information, DFCMAPLR can estimate the shared affine transforms reliably and enhance the discriminative power of the adapted acoustic model. Experiments on the ASTTEL200 Mandarin corpus show that DFCMAPLR outperforms both conventional maximum likelihood linear regression (MLLR) and fuzzy clustering MLLR (FCMLLR), which estimates both the shared affine transforms and the fuzzy weights under the maximum likelihood criterion. Compared with the baseline, DFCMAPLR reduces the average phone error rate (PER) from 24.04% to 21.67%, a relative reduction of 9.86%.
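The combination step described in the abstract can be illustrated with a minimal sketch. It assumes each shared transform is stored as a D x (D+1) matrix [A_k | b_k] and is applied to an extended mean vector [mu; 1], which is the usual MLLR convention but is not spelled out in the abstract. The function names, the example transforms, and the weight values are hypothetical, and the estimation of the transforms (MAP) and the weights (discriminative objective) is not shown.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def combine_transforms(weights: Sequence[float], shared: Sequence[Matrix]) -> Matrix:
    """Weighted sum of shared affine transforms [A_k | b_k], each D x (D+1)."""
    if len(weights) != len(shared):
        raise ValueError("need one fuzzy weight per shared transform")
    rows, cols = len(shared[0]), len(shared[0][0])
    combined = [[0.0] * cols for _ in range(rows)]
    for w, transform in zip(weights, shared):
        for i in range(rows):
            for j in range(cols):
                combined[i][j] += w * transform[i][j]
    return combined


def adapt_mean(mean: Sequence[float], transform: Matrix) -> List[float]:
    """Apply an affine transform [A | b] to a Gaussian mean: A @ mean + b."""
    extended = list(mean) + [1.0]  # extended mean vector [mu; 1] (assumed layout)
    return [sum(t * x for t, x in zip(row, extended)) for row in transform]


# Two hypothetical shared transforms for 2-dimensional features.
identity_shift = [[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]]  # A = I, b = (0.5, -0.5)
scale_only = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]]       # A = 2I, b = 0

# Hypothetical fuzzy weights for one Gaussian (or one cluster of Gaussians).
fuzzy_weights = [0.25, 0.75]

specific = combine_transforms(fuzzy_weights, [identity_shift, scale_only])
adapted = adapt_mean([1.0, 2.0], specific)
print(adapted)  # [1.875, 3.375]
```

Because only the low-dimensional weight vector differs between Gaussians, a small set of shared transforms can yield many distinct specific transforms without estimating a full matrix for each one.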
Similar resources
Discriminative adaptation for log-linear acoustic models
Log-linear models have recently been used in acoustic modeling for speech recognition systems. This has been motivated by competitive results compared to systems based on Gaussian models, and a more direct parametrisation of the posterior model. To competitively use log-linear models for speech recognition, important methods, such as speaker adaptation, have to be reformulated in a log-linear f...
Speaker Adaptation for Continuous Density HMMs: A Review
This paper reviews some popular speaker adaptation schemes that can be applied to continuous density hidden Markov models. These fall into three families based on MAP adaptation; linear transforms of model parameters such as maximum likelihood linear regression; and speaker clustering/speaker space methods such as eigenvoices. The strengths and weaknesses of each adaptation family are discussed...
Using maximum likelihood linear regression for segment clustering and speaker identification
Many adaptation scenarios rely on clustering of either the test or training data. Although consistency between the clustering and adaptation objective functions is desired, most previous approaches have not implemented such consistency. This paper shows that the statistics used in Maximum Likelihood Linear Regression (MLLR) adaptation are sufficient to cluster data with a consistent Maximum Likel...
Discriminative speaker adaptation with conditional maximum likelihood linear regression
We present a simplified derivation of the extended Baum-Welch procedure, which shows that it can be used for Maximum Mutual Information (MMI) of a large class of continuous emission density hidden Markov models (HMMs). We use the extended Baum-Welch procedure for discriminative estimation of MLLR-type speaker adaptation transformations. The resulting adaptation procedure, termed Conditional Max...
Smoothing Factor in Discriminative Feature Adaptation
These days, Discriminative Training (DT) methods are taking over the leadership in the speaker recognition task for training an acoustic model. Maximum Likelihood (ML) training suffers from some inaccuracies because of improper assumptions of the suitability of the HMM. A well-known adaptation method, feature Maximum Likelihood Linear Regression (fMLLR), is based on ML c...